Digital video processing

ABSTRACT

A method of digitally processing a sequence of video frames wherein an object which appears in the frames and undergoes relative motion or transformation is selected in a first frame by an operator; the pixels relating to the object are tagged in that frame by means of information including at least one color or appearance attribute; corresponding pixels relating to the object are located automatically in subsequent frames, by means of said information including at least one color or appearance attribute and by means of information indicating the expected position or shape of the object in the subsequent frames; and the pixels relating to the object in each of the frames are processed.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to digital video processing.

2. Description of the Related Art

In U.S. Pat. No. 5,450,500 and UK Patent Application 9407155.2 there is disclosed a digital colour processor (DCP) for colour correction in which only pixels that are specifically selected to be modified are processed by the digital circuitry. The pixels that are not to be modified are passed through the DCP without any processing that could create rounding or other errors. The processor comprises:

input means for receiving a stream of digital pixel data representative of pixels in a video picture;

selecting means for testing digital pixel data in said stream and selecting said data for modification if and only if it meets predetermined selection criteria;

modifying means for modifying said selected pixel data according to predetermined modification parameters to generate modified data;

first combining means for combining said modified data from said modifying means with unmodified data from said input means to generate output data; and

output means for supplying said output data to further equipment.

In contrast, in a conventional architecture, all of the pixels in the picture would be processed through the same signal modification path, possibly being converted from red, green and blue (RGB) to hue, saturation and luminance (HSL), and then back again to RGB, causing errors.
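The select, modify and combine arrangement listed above can be sketched in a few lines of Python. This is only an illustration of the principle, not the DCP circuitry: the function names, the selection rule `is_strong_red` and the modification `darken_red` are hypothetical, and pixels are assumed to be 10-bit (R, G, B) tuples.

```python
def process_stream(pixels, is_selected, modify):
    """Modify only selected pixels; pass every other pixel through unchanged."""
    output = []
    for pixel in pixels:
        if is_selected(pixel):
            output.append(modify(pixel))   # modification path
        else:
            output.append(pixel)           # bypass: no conversion, no rounding
    return output


# Hypothetical selection criterion and modification, for illustration only.
def is_strong_red(pixel):
    r, g, b = pixel
    return r > 800 and g < 300 and b < 300


def darken_red(pixel):
    r, g, b = pixel
    return (r - 100, g, b)


stream = [(900, 100, 100), (512, 512, 512), (850, 400, 100)]
result = process_stream(stream, is_strong_red, darken_red)
```

Only the first pixel satisfies the criterion, so the other two leave the function as the very same values that entered it, which is the property the conventional all-pixels path cannot guarantee.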

An advantage of the above system is that pixels to be modified can be selected not just in accordance with colour parameters but in accordance with other parameters such as position. Thus, only pixels within a certain area might be modified.

Pixel selection advantageously is carried out by using the architecture referred to below as the "pixel identification table" or alternatively as the "cache tag RAM". The pixel identification table stores digital bits which define which pixels will be selected from the pixel stream for modification. Pixels may be selected as a function of their color (hue) as in prior systems, and/or as a function of other criteria, such as saturation, luminance, (X,Y) pixel coordinates, sharpness, and texture, alone or in any combination.

Further, after a pixel or region to be changed has been isolated, other parameters besides (H,S,L) color attributes can be changed. For example, the sharpness or even the (X,Y) coordinates of a region can be changed. Modifying the (X,Y) coordinates of a region would be useful, for example, for special effects such as moving an object in the picture. Detecting pixels according to their (X,Y) coordinates could also be useful for copying pixels at a given (X,Y) from one frame to another for scratch concealment. The latter process might be carried out simply by, for the given (X,Y), controlling the frame store of the DCP (discussed below), so that those specific pixels are not overwritten from frame to frame.

SUMMARY OF THE INVENTION

It has now been determined that it is possible to use such a system for more advanced handling of objects. Thus, it is possible to identify an object and to track it as it moves from frame to frame. It is also possible to store information relating to the object so that manipulations other than colour correction can be carried out.

According to one inventive aspect disclosed herein there is provided a method of digitally processing a sequence of video frames wherein an object which appears in the frames and undergoes relative motion or transformation is selected in a first frame by an operator; the pixels relating to the object are tagged in that frame by means of information including at least one colour or appearance attribute; corresponding pixels relating to the object are located automatically in subsequent frames, by means of said information including at least one colour or appearance attribute and by means of information indicating the expected position or shape of the object in the subsequent frames; and the pixels relating to the object in each of the frames are processed.

An appearance attribute could include texture or sharpness. A transformation could be in the form of morphing or the like or could be simply rotation.

Processing of the pixels could consist of colour correction, or could consist of storing information about the object so that it could be copied to another sequence, or modified in size, or replaced by a different object.

The information indicating the expected position of the object in each frame could be derived in various ways including edge detection or another motion vector detection system. In a simple system an operator could draw e.g. a rectangle around the object in the first frame and a second rectangle around the object at a different position in the last frame. The system would then calculate by interpolation the expected position of the rectangle in each intermediate frame, assuming constant speed of movement. By positioning the rectangle manually in a few intermediate frames, more accurate results could be obtained, or a changing speed or direction detected and accounted for.
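The constant-speed estimate described above amounts to linear interpolation between operator-drawn rectangles. A minimal sketch follows, assuming each rectangle is given as (x, y, width, height) and that the operator's rectangles are held in a dictionary keyed by frame number; the function name `rect_at` and this data layout are illustrative choices, not part of the disclosure.

```python
def rect_at(keyframes, k):
    """Estimate the rectangle (x, y, w, h) at frame k.

    keyframes maps frame numbers to operator-drawn rectangles. Between two
    neighbouring keyframes the rectangle is assumed to move at constant speed.
    """
    frames = sorted(keyframes)
    if not frames[0] <= k <= frames[-1]:
        raise ValueError("frame lies outside the range covered by the keyframes")
    for f0, f1 in zip(frames, frames[1:]):
        if f0 <= k <= f1:
            t = (k - f0) / (f1 - f0)          # fraction of the way from f0 to f1
            return tuple(a + (b - a) * t
                         for a, b in zip(keyframes[f0], keyframes[f1]))
    # Only one keyframe was supplied and k is that frame.
    return tuple(float(v) for v in keyframes[frames[0]])


two_keys = {0: (0, 0, 10, 10), 4: (100, 50, 10, 10)}
three_keys = {0: (0, 0, 10, 10), 2: (80, 10, 10, 10), 4: (100, 50, 10, 10)}
```

With `two_keys`, frame 2 is placed exactly halfway between the first and last rectangles. Adding a manually positioned rectangle at frame 2, as in `three_keys`, makes the estimate piecewise linear, so a change of speed or direction is followed.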

It may be desired to identify more accurately the boundary of an object and, according to another inventive aspect disclosed herein, there is provided a method of identifying the picture elements within a particular object, comprising the steps of marking a plurality of points on the boundary of the object, defining vectors joining the points so as to define the boundary of the object, carrying out a scanning operation so as to identify the pixels within the area, and storing the locations of the pixels.
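One way to picture the marking, joining and scanning steps is a scan-line fill of the polygon formed by the marked points. The sketch below is an assumption about how such a scan could be done (an even-odd rule evaluated at pixel centres); the text does not specify the scanning method, and the function name `pixels_inside` is hypothetical.

```python
def pixels_inside(points, width, height):
    """Return the set of (x, y) pixels whose centres lie inside the polygon.

    points are the marked boundary points in order; consecutive points (and
    the last with the first) are joined by straight vectors.
    """
    inside = set()
    n = len(points)
    for y in range(height):
        yc = y + 0.5                              # centre of this scan line
        crossings = []
        for i in range(n):
            (x0, y0), (x1, y1) = points[i], points[(i + 1) % n]
            if (y0 <= yc) != (y1 <= yc):          # vector crosses the scan line
                crossings.append(x0 + (yc - y0) * (x1 - x0) / (y1 - y0))
        crossings.sort()
        # Pixels between each pair of crossings are inside (even-odd rule).
        for left, right in zip(crossings[0::2], crossings[1::2]):
            for x in range(width):
                if left <= x + 0.5 < right:
                    inside.add((x, y))
    return inside


square = [(1, 1), (4, 1), (4, 4), (1, 4)]
located = pixels_inside(square, 6, 6)
```

The returned set is the stored list of pixel locations; for the 3-by-3 square above it holds nine pixels.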

Additionally, account may be taken of a change in shape, for example as a car turns a corner. Starting and finishing shapes may be defined, and the system will estimate intermediate shapes.

Providing all the pixels within the originally defined object are selected in accordance with the selected colour parameters, by selecting in subsequent frames only pixels which have those parameters and are within the defined boundaries, the system can exhibit considerable selectivity. For example, a slight turning of the object might require the boundaries to be narrowed slightly, but even if this is not done, it will be accounted for since the object colours will remain the same and only pixels with those colour parameters will be selected. Pixels within the defined boundary which do not match the object colour parameters will not be selected.

Another inventive aspect disclosed herein relates to an efficient system for storing the properties of pixels or for storing the required criteria for pixels to be selected. In this system, at least two tables are established, each table having a plurality of locations representing respective possible values of a property of a pixel. Appropriate values are stored in each table, and by simple AND logic it is possible to identify the properties of a chosen pixel. The tables could respectively contain H,S,L values, x,y co-ordinates and other information as described herein. Such an arrangement permits efficient storage of the information.

In one system, each table contains information relating to a number of pixels and each location can have a plurality of values. Thus it is possible, by using different values and having corresponding values in two or more tables, to tie together the total information relating to a particular pixel.
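The following sketch shows one possible reading of tables whose locations hold a plurality of values: each location stores a bit mask with one bit per object, and ANDing the masks read from every table leaves only the objects whose criteria are met in all tables. The bit-mask encoding, the table sizes and the helper names are assumptions made for illustration.

```python
def make_table(size):
    """A table with one location per possible value of a pixel property."""
    return [0] * size


def tag(table, values, object_bit):
    """Mark every value in `values` as belonging to the given object."""
    for v in values:
        table[v] |= 1 << object_bit


def lookup(tables, properties):
    """AND the entries addressed by a pixel's properties; list matching objects."""
    mask = tables[0][properties[0]]
    for table, value in zip(tables[1:], properties[1:]):
        mask &= table[value]
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


hue_table = make_table(1024)     # assumed 10-bit hue
x_table = make_table(720)        # assumed 720 pixels per line
tag(hue_table, range(0, 50), 0)       # object 0: hues 0..49
tag(x_table, range(100, 200), 0)      # object 0: columns 100..199
tag(hue_table, range(300, 350), 1)    # object 1: hues 300..349
tag(x_table, range(150, 400), 1)      # object 1: columns 150..399
```

A pixel with hue 10 at column 150 matches object 0 only, even though column 150 is tagged for both objects, because the hue table contributes only bit 0 to the AND.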

The system described herein permits sophisticated editing techniques. In particular, objects can be tracked and their properties altered, or they can be lifted from a video.

It will be appreciated that protection may be sought for any of the inventive aspects discussed above, whether alone or in combination.

BRIEF DESCRIPTION OF THE DRAWINGS

Some embodiments of the inventions disclosed herein will now be described by way of example and with reference to the accompanying drawings, in which:

FIG. 1 (in parts A and B) is a block diagram showing a digital color processor (DCP) which can be adapted for use in a preferred embodiment.

FIG. 2 is a block diagram showing the pixel identification table of the DCP.

FIG. 3 schematically shows the arrangement of a cache tag RAM for hue in the pixel identification table of FIG. 2.

FIG. 4 schematically shows the arrangement of cache tag RAMs for X and Y coordinates in the pixel identification table in FIG. 2.

FIGS. 5A and 5B show a side-by-side comparison of a cache tag RAM containing one bit at each memory location versus a cache tag RAM containing 3 bits at each memory location.

FIG. 6 is a schematic illustration of a cache tag RAM for hue values, showing a RAM divided into eight channels, each channel being arranged for storing data bits corresponding to a respective set or range of hue values.

FIG. 7 is a flow chart illustrating the channel priority logic in the pixel identification table of FIG. 2.

FIG. 8 is a schematic diagram showing the use of a cache tag RAM to store selection criteria corresponding to video textures.

FIG. 9 is a simplified illustration of the control panel of the DCP.

FIG. 10 is a schematic illustration of signal flow among various components of the DCP according to a practical embodiment thereof.

FIG. 11 shows a card rack arrangement in a practical embodiment of the DCP.

FIG. 12 is a logic diagram showing a set of cache tag RAMs corresponding to various pixel attributes for one channel, and the priority logic, in the pixel identification table, as well as the offset register for hue for that channel within the offset table.

FIG. 13 is a schematic block diagram showing a first form of a relative or grey-level cache tag RAM.

FIG. 14 is a schematic block diagram showing a second form of a relative or grey-level cache tag RAM.

FIG. 15 is a more detailed block diagram corresponding to the arrangement of FIG. 14.

FIG. 16 is a diagram showing how the system of FIG. 1 may be modified to accept positional information.

FIG. 17 is a diagram showing the basic layout of a system enabling a simplified colour corrector to be used.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

I. Introduction

The DCP disclosed herein is an advanced, multi-functional all-digital color corrector as disclosed in U.S. patent application Ser. No. 08/045,560 and UK patent Application 9407155.2. The inputs and outputs of the DCP are preferably 10-bit digital RGB signals, that is, with ten bits used to describe each of the input red, green, and blue signals.

All components are standard. All functions and timing of the disclosed components are controlled by a Motorola 56000 (or 58000) series microprocessor using conventional techniques.

High definition television requires five times the data rate and bandwidth of standard definition television. The data rate of the DCP will match the HDTV1 output from the BTS FLH1000 telecine. By way of two 50-pin D-connectors, the DCP can accept multiplexed Y/Cb/Cr data in the 4:4:4 format. The word size is 10 bits. The DCP will accommodate a maximum clock rate of 80 MHz. The following line standards will be supported:

    ______________________________________
    Lines  Hz     Pixels/Line  Clock (MHz)
    ______________________________________
    1250   50     1920         72
    1050   59.94  1920         72
    1125   60     1920         74.25
     525   59.94   720         13.5
     625   50      720         13.5
    ______________________________________

Internal calculations will be carried out at 16-bit accuracy, which will prevent rounding errors; the results will be rounded to 10 bits at the final stage of the signal modification path.

Hue can be modified throughout a full 360° range in the cylindrical color space. Saturation and luminance can be varied from 0 to 400%. Hue, saturation and luminance can be selected with 10-bit accuracy.

According to a preferred embodiment of the invention, digital video information, for example from the FLH1000 telecine, is decoded and demultiplexed by means of a decoder/demultiplexer such as a standard gate array logic, or an ASIC (element 1 in FIG. 1), or any other conventional circuit which is able to provide a 10-bit Y/Cr/Cb (YUV) data input at up to 74.25 MHz clock rate. This signal is converted by a digital multiplication matrix 2 to provide RGB data. By changing the coefficients within the matrix, master saturation and luminance controls are provided.

Reference is made to FIGS. 10 and 11. Even with the speed of recently available semiconductor devices it is not cost-effective to build signal processing circuitry to cope with up to 80 MHz data rates. However, the architecture described herein can easily be broken down into a series of blocks which can operate in parallel to achieve the desired bandwidth.

The device 1 will accept full bandwidth data through the input and output ports of the system. This device, together with ECL buffers, can directly accept signals at clock speeds up to 100 MHz and provide parallel (multiplexed) TTL data streams at reduced frequency.

The device 1 outputs two multiplexed channels, each at half the input clock rate. In the worst case (1125-line, 60 Hz, 74.25 MHz clock) each channel will operate at 37.125 MHz. Each of the A and B channels will be 30-bit-wide parallel data. If pixels are numbered horizontally across each TV line starting at 00, then channel A will carry the even-numbered pixels while channel B will carry the odd-numbered pixels. Differential ECL drivers will be used to carry this data through the backplane to alternate pairs of identical cards. On each card a further level of multiplexing will provide two sub-channels at 1/4 of the input clock rate (i.e., 18.6 MHz maximum) which can then be processed efficiently by the various DCP logic blocks. Each card will therefore carry two identical sets of circuitry together with the necessary MUX/DEMUX logic.
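The even/odd split and the resulting clock rates can be checked with a short sketch. The function names are hypothetical, and the 74.25 MHz figure is the highest clock listed in the table of supported line standards.

```python
def demux_even_odd(line_pixels):
    """Split one TV line into channel A (even pixels) and channel B (odd pixels)."""
    return line_pixels[0::2], line_pixels[1::2]


def channel_rates(input_clock_mhz):
    """Clock rate of each A/B channel and of each on-card sub-channel, in MHz."""
    return input_clock_mhz / 2, input_clock_mhz / 4


channel_a, channel_b = demux_even_odd(list(range(6)))
per_channel, per_sub_channel = channel_rates(74.25)
```

For a 74.25 MHz input the two values are 37.125 MHz and 18.5625 MHz, the latter being the figure that the text rounds to 18.6 MHz.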

As mentioned above, differential ECL drivers and receivers will be used to carry video data through the backplane. This method has already been proved reliable in commercially released Pandora color correctors, which carry multichannel video data multiplexed at similar speeds. ECL will also be used onboard each card to perform the sub-MUX/DEMUX function.

The main microprocessor control channel which runs through the backplane of the system will use BTL drive logic similar to those devices specified for Futurebus. This is desirable in order to achieve the required speed and fanout. Active bus termination will be employed at each end of the backplane.

Standard television has 500-600 lines (625 in Europe, 525 in the United States) per frame. High definition television has more than 1,000 lines. Film-grade resolutions are equivalent to many thousands of lines (2,000-8,000 for example). With appropriate downsampling and interpolating as described below, the DCP is capable of operating in the following modes:

(a) standard definition main path, standard definition modification path (i.e., no subsampling or interpolating) in real time;

(b) high definition main path, standard definition modification path (i.e., with subsampling at 3a and interpolating at 5a) in real time;

(c) the use of the system as in (a), without subsampling or interpolation, for non-real time processing of high definition or film resolution video; and

(d) high definition main path, high definition modification path, in real time, without subsampling or interpolating.

In all of these cases, the pixel depth, i.e., bits per pixel, is 10 bits (1,024 levels).

"Resolution" refers herein to the spatial resolution between lines, or pixels within a line; while "pixel depth" refers to the number of bits per pixel.

A block diagram of the DCP is shown in FIG. 1. A primary signal processing path transmits the input signals to the outputs. A secondary path forms a bypass off of the primary path, and is used to calculate the modification signals.

II. Primary Signal Path

The primary signal path starts with the inputting of RGB digital signals at an input, which may be respective 10-bit RGB output signals from a telecine, a VTR, or any other digital video signal.

The R,G,B digital signals may be supplied by a conventional input arrangement as shown in FIG. 1, which comprises an input decode and demultiplex unit 1, which receives RGB signals and outputs Y, U and V signals; an intermediate master saturation control which processes the U and V signals from the unit 1; and a digital matrix 2, the latter outputting the 10-bit R, G and B signals.

For pixels that are not intended to be modified, a completely transparent R,G,B signal path is provided from point B (or point A if the LUT's 3 are not set to modify the video signal) through the output of the convolver 7a (and further, through the LUT's 8 for legal video signals). For use with the RGB output from the Rank Cintel 4:4:4 URSA store, conversion between RGB and YUV is completely unnecessary. In any event, the conversion between YUV and RGB and vice versa at stages 1, 2, 9 and 10 in FIG. 4 is essentially reversible and does not introduce errors in the normal case.

The R, G and B signals are then provided to respective primary lookup tables 3 (LUT's). These can be used, if desired, to perform the same functions as the conventional gain (white level), lift (black level), and gamma (contrast and midtone) controls of the telecine. They can also modify color. The primary lookup tables can modify all of the pixels in the entire picture. They can be used to perform "master" modifications (that is, modifications applied equally to the red, green, and blue channels resulting in a tonal change, but not a color change), by applying an identical modification to the red, green, and blue lookup tables. "Differential" modifications are accomplished by applying modifications to only one or two of the lookup tables. In this way it is possible to modify, for example, the gamma of blue only.

The primary LUT's 3 are preferably dual-ported RAM's, so that they can be loaded via one port, while the main signal processing function continues via the other port. Thus the LUT's 3 are capable of being reloaded between frames, which is known to those in this art as "dithering." Applying different LUT's to successive frames is useful for temporally breaking up grain and noise, for example.

One reason for replicating the functionality of the telecine controls with the primary lookup tables 3 is to be able to custom-load these tables and thereby accomplish a degree of control not available on the conventional telecine.

The primary lookup tables 3 are not essential to this invention, but are primarily a convenience for use in tape-to-tape transfers. They also may be used to control the response curves of the DCP in order to, for example, emulate a particular telecine. They are loaded by the DSP 4, which is controlled by a programmer/controller such as the POGLE controller described above.

The DSP (digital signal processor) 4 is a microprocessor of any conventional design which is capable of implementing an algorithm for producing a family of curves, dependent on the parameters of the algorithm. For example, a family of different parabolic curves can be generated by calculating in the DSP 4 the value of the output of the algorithm on a step-by-step basis for each step of the lookup table. For example, if the DSP is programmed with the equation of a straight line, a straight line is loaded into the lookup tables 3.

The lookup tables 3 are constructed by using RAM memory which is addressed by the input video signal. When the system is first powered up, the processor 4 associated with each lookup table writes an incrementing series of values at each address in the RAM. In other words, initially, the contents at a given address equals that address. Thus, when the video signal is applied to the address input of the RAM, the data output provides exactly the same hexadecimal value and so the video signal passing through the RAM remains unchanged.

However, at any time, the DSP 4 may calculate a different series of values to be written into the RAM for providing a translation of the video signal. By this means it is possible to transform any red value in an input signal, for example, into any other red output value.
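The behaviour of a lookup table addressed by the video signal can be sketched as follows. The identity table mirrors the power-up state described above; the gamma curve is only one assumed example of a parameterised curve family, and the function names are hypothetical.

```python
SIZE = 1024  # one location per 10-bit input value


def identity_lut():
    """Power-up contents: the value stored at each address equals the address."""
    return list(range(SIZE))


def gamma_lut(gamma):
    """One member of a curve family, computed step by step for every address."""
    top = SIZE - 1
    return [round(top * (v / top) ** gamma) for v in range(SIZE)]


def apply_lut(lut, samples):
    """The video sample addresses the table; the stored value is the output."""
    return [lut[s] for s in samples]


unchanged = apply_lut(identity_lut(), [0, 512, 1023])
brightened = apply_lut(gamma_lut(0.5), [0, 256, 1023])
```

Loading `gamma_lut(0.5)` into only the blue table would correspond to the "differential" case, while loading the same table for red, green and blue would be a "master" change.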

At point B, after processing by the primary lookup tables 3, all of the corrected R, G and B signals, including those that are not to be modified, are provided (possibly downsampled) to the secondary signal path.

If the primary signal path is high definition (HD) then it is advantageous for the modification path to be standard definition. Therefore, the HD image is subsampled down at point B and is interpolated up at point C. A subsampler 3a and an interpolator 5a are shown in FIG. 4. According to one simple subsampling technique, it is possible to simply pick every other pixel and every other line in the subsampling process, and then to replicate each pixel and each line in the interpolating process. Also useable are more complex techniques such as bilinear sampling and interpolation (that is, linear interpolation in both the along-line and between-line directions); and even more complicated interpolators such as the Cubic-B spline technique. See, Pratt at pages 113-116, incorporated by reference.

According to one example of a technique of bilinear sampling and interpolation, the subsampler 3a could interpolate down, for example, by averaging 2×2 arrays of pixels in a high definition 1,000-line picture and converting those four pixels to one standard-definition pixel, that is, a four-fold data reduction. The resulting data rate will be 1/4 of the high definition rate, that is, approximately the standard definition rate. Correspondingly, after processing by the signal modification path, the interpolator 5a would interpolate up by, for example, bilinear interpolation between adjacent ones (in both the X and Y directions) of the standard-definition pixels that have been processed in the signal modification path.
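A minimal sketch of the four-fold reduction follows. It averages each 2×2 block on the way down and, for brevity, uses the simple pixel-and-line replication mentioned earlier on the way up rather than bilinear interpolation. Images are assumed to be lists of rows of integer samples with even dimensions; the function names are illustrative.

```python
def subsample_2x2(image):
    """Average each 2x2 block of pixels into one pixel (four-fold reduction)."""
    height, width = len(image), len(image[0])
    return [
        [(image[y][x] + image[y][x + 1] +
          image[y + 1][x] + image[y + 1][x + 1]) // 4
         for x in range(0, width - 1, 2)]
        for y in range(0, height - 1, 2)
    ]


def replicate_2x(image):
    """Restore the original size by repeating each pixel and each line."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


hd = [[0, 4, 8, 8],
      [4, 8, 8, 8],
      [100, 100, 0, 0],
      [100, 100, 0, 0]]
sd = subsample_2x2(hd)
restored = replicate_2x(sd)
```

The 4×4 input becomes a 2×2 image, one quarter of the data, and replication returns it to 4×4 for recombination with the main path.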

Next along the primary signal path is a digital delay 5, which may comprise one or more delay lines. In the disclosed embodiment, the delay 5 provides a time delay of two lines (2L). This delay gives enough time for the secondary signal path to calculate the modification signals, which will be used to modify the original signals.

After the delay 5, the modification signals are brought back into the main signal path at point C, and combined, for example by interpolation (upsampling) by an interpolator 5a, with the unmodified main signals by correction adders 6. The output signals from the adders 6 then form the 10-bit red, green, and blue modified digital outputs which are then filtered by the convolver 7a and subsequently outputted from the DCP.

Element 7 is a buffer frame store, which provides resynchronization of the video signal after processing. Video delay through the system is thereby arranged to be one TV frame. H and V output timing adjustments (not shown) are also provided.

The logic used to provide both primary and secondary color correction is pipelined for reasons of speed. This means that the signal at the output of the adders 6 is delayed in time relative to the input. In a studio environment this time delay would have disastrous consequences and so the frame store 7 is used to re-synchronise the video output.

This frame store 7 is constructed using RAM configured as a FIFO memory, whereby data is written in using the input clock which has been passed through the pipelined stages discussed above. Some time later a regenerated clock locked to station reference is used to read out the data in a similar order, but in synchronisation with system line and field timing.

A second frame store (not shown) is provided in parallel with the above to allow the operator to store a single frame of video for color matching. The input to this second frame store is switchable from various stages throughout the processing circuitry, allowing the colorist to compare between a corrected and uncorrected image. The output of the second frame store may be overlaid on the main video output by means of a WIPE control.

Convolver 7a receives the output of the frame store 7 and smooths it according to a known convolution scheme which is discussed below in connection with the convolver 17.

The convolver 7a output is passed to a lookup table 8 which is used to perform a clipping function on the RGB signal in order to ensure that improper or illegal values are not passed to the output. The lookup table may be updated at any time by a microprocessor (not shown), allowing different algorithms for clipping to be selected.

Element 8 is a clipping lookup table, which clips signals at their maximum or minimum values to prevent "roll-around", i.e. to keep the output within the 0-255 (8-bit) range. In addition, it is normally necessary to restrict the output to the 12-240 range, as required by the SMPTE standard for digital video, the reserved areas above and below this range being used for blanking and special effects. The LUT's 8 may be reconfigured under software control (not shown) to select either "hard-" or "soft-edged" clipping algorithms.
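A hard-edged clipping table of the kind described can be sketched as below, using the 0-255 and 12-240 figures quoted in the text. The function name is hypothetical, and a soft-edged variant is not shown because the text does not define its curve.

```python
def hard_clip_lut(low, high, size):
    """Table that maps every input value into the range low..high."""
    return [min(max(value, low), high) for value in range(size)]


clip = hard_clip_lut(12, 240, 256)
legal = [clip[sample] for sample in (0, 100, 255)]
```

Values already inside the legal range pass through unchanged; values in the reserved areas are pinned to the nearest limit instead of wrapping around.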

Finally the RGB signal is re-converted to YUV color space before being passed to the output multiplexer and the line driver stage. Use may be made once more of the above-discussed decoder/demultiplexer (in reverse) or an equivalent device to reconstruct the output signal in a similar format to that originally derived from the FLH1000 or other telecine.

Element 9 is a matrix which converts, if necessary, from the RGB environment to a YUV (luminance-chrominance) environment. Finally, in the output encoder and multiplexer 10 the YUV signals are conventionally encoded and multiplexed for, e.g., broadcasting or recording.

III. Secondary (Modification) Signal Path

In the secondary signal path, the DCP produces respective modification signals for those pixels, and only for those pixels, which have the criteria indicating that they are to be modified.

A. Signal Conversion to (H,S,L)

The first step in the modification signal path is a digital matrix and coordinate translator unit 11 which converts the red, green, and blue signals into signals representing hue, saturation, and luminance. There are several commercial chips which can perform this function by use of public-domain algorithms. In this case, the matrix first provides conversion from (R,G,B) to (Y,U,V). The Y signal becomes the final L signal. The coordinate translator converts from Cartesian coordinates (U,V) to cylindrical polar coordinates (H,S), by means of a lookup table in PROM.

Transformation from R,G,B signals into cylindrical color space (H,S,L) is described, for example, in R.W.G. Hunt, The Reproduction of Color in Photography, Printing, and Television (Fountain Press, Tolworth, England, 4th ed. 1987), ISBN 0-85242-356-X, at 114-121, incorporated by reference. In cylindrical color space, luminance is conventionally shown as a vertical axis. Planes which intersect this axis at right angles are planes of constant luminance. Planes of constant hue extend radially out from the axis, each having a hue angle. Saturation or amount of color is represented by the distance from the axis; thus, at the luminance axis there is no color.
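The two-stage conversion can be illustrated numerically. The sketch below first forms (Y,U,V) from (R,G,B) and then converts (U,V) to polar form, with Y passed through as L. The matrix coefficients are the common ITU-R BT.601-style values and are an assumption, since the text does not state which coefficients the unit 11 uses; the polar step is computed directly here, whereas the text describes a lookup table in PROM.

```python
import math


def rgb_to_hsl_polar(r, g, b):
    """Convert normalised R, G, B (0..1) to (hue in degrees, saturation, luminance)."""
    # Stage 1: matrix from (R,G,B) to (Y,U,V); coefficients are assumed.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    # Stage 2: Cartesian (U,V) to polar (H,S); Y becomes L unchanged.
    hue = math.degrees(math.atan2(v, u)) % 360.0
    saturation = math.hypot(u, v)
    return hue, saturation, y


grey = rgb_to_hsl_polar(0.5, 0.5, 0.5)
red = rgb_to_hsl_polar(1.0, 0.0, 0.0)
```

A grey input lands on the luminance axis, so its saturation is essentially zero and its hue is meaningless, which matches the statement that there is no color at the axis.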

One possible hardware implementation, incorporated by reference herein, utilizes first the TRW model TMC2272 chip, which transforms the incoming RGB to YUV, which is a color space comprising luminance (Y) and two mutually orthogonal chrominance vectors U and V. The second stage is the TRW model TMC2330 chip, which mathematically transforms from rectangular (Y,U,V) to polar coordinates (H,S,L). Both of these chips are also usable in the reverse direction for conversion from HSL to YUV to RGB.

H,S,L color space is conceptually convenient to use in practice. In contrast, the U and V vectors are difficult to imagine. The conversion from RGB to YUV to HSL is in two stages for convenience, as standard chips are readily available to do these two conversions, while no standard chip is readily available for converting directly from RGB to HSL. On the other hand, three-dimensional RGB color space is essentially cubical and therefore, it is advantageous to carry out the clipping functions by the LUT's 8 (and also the master transformations by the LUT's 3) in RGB space.

B. Pixel Identification Table

Following the conversion to H, S, and L, selected boundary conditions in this color space are inputted under operator control into a pixel identification table 15, which distinguishes the region of color space to be modified, from the region not to be modified. This technique will be referred to herein as "cache tagging". It involves defining a range of data bounded by particular data values. These ranges are "tagged" in a "cache tag RAM" (described below) for later use. As shown in FIG. 1, X, Y and T tags may be employed. At least H, S and L "tags" are employed in the preferred embodiment.

For each pixel, it is determined whether to "modify" or "not modify" that pixel by taking the logical AND of the output bits from the H,S,L, etc., cache tag RAMs, which are loaded with the predetermined criteria for selecting which pixels in the input signal are to be modified. If all of the output bits are "1", that will indicate that for that pixel, a modification signal will be generated, which will be added back into the main signal path later on.
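The AND decision described above can be sketched with three one-bit tables. The sketch assumes 10-bit H, S and L values (1,024 locations per table) and loads each table with a single contiguous range; the helper names and the example ranges are hypothetical.

```python
DEPTH = 1024  # one location per 10-bit attribute value


def make_tag_ram(low, high):
    """One-bit-wide table: 1 inside the tagged range low..high, 0 elsewhere."""
    return [1 if low <= value <= high else 0 for value in range(DEPTH)]


def should_modify(h, s, l, ram_h, ram_s, ram_l):
    """A pixel is modified only if every table outputs 1 for its attributes."""
    return bool(ram_h[h] & ram_s[s] & ram_l[l])


# Hypothetical tags for a narrow band of strongly saturated hues.
ram_h = make_tag_ram(0, 40)
ram_s = make_tag_ram(600, 1023)
ram_l = make_tag_ram(200, 900)
```

A pixel with (H,S,L) = (20, 700, 500) is selected, while (20, 100, 500) is not, because its saturation falls outside the tagged range even though its hue and luminance match.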

As an example of this process, the DCP is capable of tagging only a very narrow range of reds, which might be useful, for example, to improve the color of one piece of red clothing. All other "reds", and other colors including black, white and grey, remain untouched. By the same process, all of the colors in a multicolored piece of clothing can be selected simultaneously.

Advantageously, there is also a "master hue" or "wash" mode, wherein all of the pixels in the picture are marked to be changed, and then the hues or other attributes of all the pixels can be changed simultaneously.

"X" and "Y" tags can also be used in the cache tag RAMs, in order to represent the boundaries spatially (where X and Y are pixel and line addresses). X and Y tags can be inputted, as shown in FIG. 1, by direct entry of X and Y addresses into the pixel identification table 15.

X and Y coordinates of a particular pixel are determined from the studio synch signal as seen at the left-hand portion of FIG. 1. Lines are counted by a line counter and pixels within each line are counted by a pixel counter. The line counter and pixel counter have outputs corresponding respectively to the Y and X coordinates of each pixel. The synch signal contains a frame reset signal which corresponds to X=0, Y=0 followed by a series of pulses for incrementing X, followed by a line reset signal (which resets X to 0 and increments the line number Y). The X and Y signals are made available to the pixel identification table 15. The availability of the X and Y coordinates of each pixel enables processing of each pixel "on the fly" in a very simple manner.
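The pixel and line counters can be modelled in software as follows. The synch signal is represented here as a list of event names ("frame", "pixel", "line"), which is an illustrative encoding and not the actual signal format.

```python
def coordinates(sync_events):
    """Return the (X, Y) coordinate of every pixel pulse in the event stream."""
    x = y = 0
    coords = []
    for event in sync_events:
        if event == "frame":        # frame reset: X = 0, Y = 0
            x, y = 0, 0
        elif event == "line":       # line reset: X back to 0, next line
            x, y = 0, y + 1
        elif event == "pixel":      # pixel pulse: record, then increment X
            coords.append((x, y))
            x += 1
        else:
            raise ValueError(f"unknown sync event: {event!r}")
    return coords


events = ["frame", "pixel", "pixel", "line", "pixel", "pixel"]
positions = coordinates(events)
```

Each pixel receives its coordinate as it arrives, with no frame buffer needed, which is what allows the X and Y tag RAMs to be addressed on the fly.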

Alternatively, a conventional key input channel to the pixelidentification table 15 is essentially a substitute for the cache tagRAM. (It could also be used in tandem with the X and Y tag RAM.) Aconventional key input signal can be applied to the key input channeland ANDed with the H, S and L table outputs to control directly when theoffsets from the offset table 16 are to be applied to a given pixel. Asis conventional, the DCP and the source of the key input signal arecoordinated by the common sync signal to which all of the studioequipment is normally connected. The key signal, again as isconventional, is a black-on-white or white-on-black picture signal(which may be rectangular or have any other shape) which can be used tocontrol further equipment such as the offset table 16. A key inputsignal can be generated by a vision mixer such as the Abekas A84 andmany other devices.

Also as seen in FIG. 1, the pixel identification table 15 can be employed to indicate selected pixels by means of a conventional key output signal on a key output channel, for controlling further equipment, rather than to select offsets from the offset table 16.

FIGS. 2 and 3 show the structure of the pixel identification table 15 in more detail. It comprises a cache tag RAM (or CTR) 150, which in this embodiment comprises at least hue, saturation, and luminance RAMs 150H, 150S and 150L, respectively. These may be supplemented by X and Y RAMs 150X, 150Y. Hue, saturation, and luminance components of each pixel are supplied by the digital matrix and coordinate translator 11 at point D as described above. Select signals SH, SS and SL are provided by the control panel or by a controller such as the POGLE controller and provide data to be entered into the RAMs 150H, 150S and 150L respectively, to indicate how pixels are to be selected for modification according to their hue, saturation and luminance (and optionally SX and SY signals, and/or other signals representing sharpness, texture, or another parameter). The entered selection criteria distinguish the regions to be modified from the regions not to be modified, and are used to generate control signals according to predetermined standards to control the DCP. The RAMs 150H, etc., will be described further below in more detail.

By means of a cursor, the operator of the DCP can point on a screen to a particular color, or to a physical region containing one or many colors. The programmer/controller will then provide (H,S,L) data and optionally (X,Y) or other data to the pixel identification table.

There are a plurality of channels (for example, 8 channels) each having a set of cache tag RAMs 150 which can thereby specify 8 modification sets. For example, 8 objects in a picture can be modified if they can be specified uniquely by a combination of H, S and L data, optionally supplemented by X and Y data, for example. The RAMs 150H, 150S and 150L are each 1K RAMs, i.e., RAMs having 1,024 address locations corresponding to a 10-bit input address. The CTRs can be implemented by standard components or by an application-specific integrated circuit (ASIC). By means of such RAMs, 1,024 individual degrees of hue, saturation and luminance can be defined. Thus, 3K bits can control 2³⁰ (or 1,073,741,824) possible colors. Great color resolution can be controlled by means of a minimal amount of data.

FIG. 3 is a schematic diagram indicating a possible implementation of the hue CTR 150H. As an example, the bottom third of the addresses in RAM 150H could be designated to correspond to respective shades of red. The middle third could correspond to shades of green, and the top third of the addresses in RAM 150H could be designated to correspond to shades of blue. These designations are indicated by the letters B and R on the left side of FIG. 3. As seen therein, bits 4-13 are loaded with the value "1" and the rest of the bits are loaded with "0." Thus, a narrow range of shades of red that have been defined to correspond to bits 4-13 are being selected for modification. Every pixel is applied as an address to the hue CTR 150H in the pixel identification table 15. If a pixel's hue is binary 4 to 13 the output of the CTR 150H will be 1, indicating that that pixel has a hue in that range of red shades. Those pixels will be modified according to a predetermined modification stored for that channel in the offset table 16.

If, in the preceding example, a pixel with that specific shade of red is to be selected regardless of its saturation and luminance, then the S and L RAMs 150S and 150L are loaded completely with 1's.

The H, S, and L table contents for a particular pixel are ANDed to determine whether that pixel will be selected for modification. For example, all pixels of a given hue, irrespective of the S and L, can be selected by loading selected locations in the H table with ones, and all of the S locations and all of the L locations with ones. Or, only certain combinations of H, S and L can be selected by only filling portions of each table, which need not be continuous, with ones. The cache tag RAM signals are ANDed, and therefore, only if all of the criteria (H, S, L, X, Y, and any other criteria being used) are met, will that pixel be tagged for alteration.
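The ANDing of the table outputs may be modelled in software as shown below. This is a minimal sketch under stated assumptions, not the disclosed circuit: the function names `make_table` and `is_tagged`, and the representation of a pixel as a dictionary of 10-bit attribute values, are hypothetical. The hue range 4 to 13 follows the FIG. 3 example.

```python
SIZE = 1024  # 10-bit address space of each 1K tag RAM


def make_table(selected):
    """Return a 1K list of 0/1 tags with ones at the given addresses."""
    table = [0] * SIZE
    for address in selected:
        table[address] = 1
    return table


def is_tagged(pixel, tables):
    """AND together the tag bits addressed by each attribute of the pixel."""
    return all(tables[name][pixel[name]] == 1 for name in tables)


tables = {
    "H": make_table(range(4, 14)),  # hue addresses 4-13 selected
    "S": make_table(range(SIZE)),   # all ones: saturation disregarded
    "L": make_table(range(SIZE)),   # all ones: luminance disregarded
}

red_pixel = {"H": 10, "S": 700, "L": 300}
other_pixel = {"H": 500, "S": 700, "L": 300}
```

Further criteria such as X, Y or texture would simply be added as extra entries in `tables`, since every table present must output 1 for the pixel to be tagged.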

Advantageously, there will be a macro feature on the controller to carry out any routine series of loading functions, such as, for example, setting up the DCP to select pixels of given hues, automatically loading all of the S and L locations with ones in order to disregard saturation and luminance.

In practice, it has been found advantageous for there to be default settings for the H, S and L tables. By default, all luminance values are selected by filling all locations in the L table with ones. Channels 1-6 are each loaded with 1/6 of the hue range. The top 95% of the saturation range is loaded with ones, in order to select substantially all colors, but not neutrals (which have zero saturation).

FIG. 4 shows a possible implementation of CTRs 150X and 150Y, which again are 1K RAMs. These two RAMs can be used to designate individual pixels, or rectangular regions that occur at intersections of X and Y ranges. The Y and X locations correspond respectively to lines and locations within lines in the picture. Controlling spatial regions of a picture for modification with 1K RAMs for the X and Y coordinates is not as powerful a technique as using a 2-dimensional address memory, for example, but it is almost as useful and is still very powerful because again, with only 2K bits of data, one million distinct pixel locations can be designated. Thus, by this technique, the DCP can delineate, for example, a rectangular region of the picture to be modified (or not modified).

As an example of cache tagging, let us consider the case where we wish to modify all pixels in the picture with a "mid-range" value of luminance. In this example, the control panel will interpret its settings as an instruction to change pixels which have any value of hue, and any value of saturation, but a luminance value greater than a lower threshold L1, and less than an upper threshold L2. This will cause the luminance tag RAM to be loaded with zeroes for the possible 10-bit values from 0 to L1. For example, if L1 is 256 (one-quarter range) and L2 is 768 (three-quarters range) then addresses 0 to 256 of the L cache tag RAM will be loaded with zeroes ("do not modify"). The addresses 257 to 767 will all be loaded with the value "1" ("modify"). The remainder of the cache tag RAM addresses (addresses 768 to 1023) will be loaded with zero ("do not modify").
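The loading of the luminance tag RAM in this example can be illustrated as follows. The function name `load_luminance_tags` is hypothetical, and the sketch simply treats the RAM as a list of 1,024 bits with the thresholds excluded from the selected range.

```python
def load_luminance_tags(lower, upper, size=1024):
    """Tag addresses strictly between lower and upper with 1, the rest with 0."""
    return [1 if lower < address < upper else 0 for address in range(size)]


# L1 = 256 (one-quarter range), L2 = 768 (three-quarters range)
l_table = load_luminance_tags(256, 768)
```

With these thresholds the ones occupy addresses 257 to 767, and every other address holds zero.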

It can be seen from this simple example that we can distinguish by this technique between any region in color space and any other region. Even if two regions have the same hue, they can be distinguished on the basis of luminance or saturation. In accordance with one of the inventive aspects disclosed herein, for more complex cases, one can distinguish by logical combinations of H, S, and L limits and X and Y addresses. Note that a range of a single parameter or a region of colors need not be contiguous. Thus, if 157 non-consecutive values of hue were to be modified, at those 157 hue-valued addresses in the hue cache tag RAM, there would be a "1". This demonstrates the enormous resolving power of the cache tag system.

As mentioned above, the architecture of the DCP provides for a plurality of independent channels. For example, 6, 8 or 10 channels may be sufficient for most purposes. FIG. 6 schematically shows 8 channels. Thus there can be eight "channels" with respective pixel identification tables 15, which are able to modify eight separately defined regions, colors, luminance ranges, etc. These regions can overlap.

In practice, all 8 channels of hue, for example, can be implemented with one 8K hue RAM. The hue RAM has 8 bits of data at each address, each bit corresponding to one hue address for one of the 8 modification channels.

The Hue CTR is structured in bytes as is normal for memories. Each bit of the 8-bit byte corresponds to one channel and represents either "select" or "not select" the particular hue which corresponds to that byte for that particular channel.

FIG. 6 shows the hue CTR in greater detail. FIG. 6 shows an 8K RAM where 8 channels (1H-8H) have been designated having 1K (1024) bits each. This will be presumed to be the H CTR, the S and L CTRs being identical. A given 10-bit H value of an input pixel is inputted to the CTR 150H and is used as an address for addressing all 8 of the channels. If, for channel 1, the H, S and L CTRs all have 1 at a particular address corresponding to the input pixel's H, S and L value, then that pixel is said to have been "tagged" for alteration. Then, for that channel, the ΔH, ΔS and ΔL which have been preloaded, will be supplied from the offset table 16.
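The byte-wide organisation of the CTRs lends itself to the following sketch, in which one bitwise AND evaluates all eight channels at once. The helper names are hypothetical, and the assignment of channel 1 to the least significant bit is an assumption, since the bit ordering is not specified here.

```python
CHANNELS = 8
SIZE = 1024


def set_tag(ram, channel, address):
    """Set the bit for `channel` (1-8) in the byte stored at `address`."""
    ram[address] |= 1 << (channel - 1)


def channels_tagging(h_ram, s_ram, l_ram, h, s, l):
    """Return the channels whose H, S and L bits are all 1 for this pixel."""
    combined = h_ram[h] & s_ram[s] & l_ram[l]  # one AND covers all 8 channels
    return [ch for ch in range(1, CHANNELS + 1) if combined & (1 << (ch - 1))]


h_ram = [0] * SIZE
s_ram = [0xFF] * SIZE  # every channel accepts every saturation
l_ram = [0xFF] * SIZE  # every channel accepts every luminance
for address in range(4, 14):
    set_tag(h_ram, 1, address)  # channel 1 selects hue addresses 4-13
for address in range(10, 20):
    set_tag(h_ram, 3, address)  # channel 3 overlaps it at addresses 10-13
```

A pixel whose hue falls in the overlap is tagged by both channels, which is the situation the priority logic described next must resolve.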

C. Priority Logic

The DCP pixel identification table 15 contains precedence logic 250 to resolve internal conflicts between channels about color modification. Many such conflicts will be avoided by the use of X and Y cache tag RAMs to specify a specific physical object whose color is to be modified, but even then, a conflict will occur when moving objects find themselves temporarily in the same X,Y region. To implement the priority logic, highest priority may be given to lower-numbered channels, and lower priority to higher-numbered channels. This is not a restriction on operational flexibility, as channels can be renumbered at will.

As an example, it might be desired to modify a particular red shade when it occurs in a traffic signal in a given scene, but not when it occurs in the clothing of a person walking near that traffic signal. The solution would be to give priority to a channel which specifies both clothing color and location, so that the red shade will not be modified unless it is at the proper location.

As another example, if it were required to make an image go monochrome, except for the reds in the picture, one channel of the DCP could be used to make all of the picture monochrome. Then, a second channel would be selected to identify reds, to prevent them from going monochrome. The second channel would be designated to have priority over the first channel.

Channel 1 is always the highest priority channel. An input pixel, for example, is applied first to channel 1. However, a given priority hierarchy can easily be modified by swapping the content of each channel with that of any other channel. The channel numbers assigned to each 1K bit array in the CTR are internally controlled and managed within the DCP.

The priority logic is shown in more detail in FIG. 7. For example, if channel 1 has been loaded to tag red colors and change them to blue, and channel 6 has been loaded to tag light colors and change them to dark, a pixel with a light red color will be corrected and changed to light blue by channel 1. It will not be corrected by channel 6, because channel 1 has priority. If the operator does not like this result he can reorder the priority by swapping the contents of channels 1 and 6. Channel 1 will become the light color channel and channel 6 will become the red channel. Thus, a light red will now be controlled by channel 1 and changed to dark red.
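The priority rule and the channel swap may be illustrated by the following sketch. It is a software analogy only; the names `select_channel` and `swap_channels` and the use of text strings to stand for channel contents are hypothetical.

```python
def select_channel(tagged_channels):
    """Return the winning channel: the lowest-numbered one that tagged the pixel."""
    return min(tagged_channels) if tagged_channels else None


def swap_channels(settings, a, b):
    """Swap the contents of two channels to reorder their priority."""
    settings[a], settings[b] = settings[b], settings[a]


settings = {1: "red -> blue", 6: "light -> dark"}
light_red_tags = [1, 6]  # both channels tag a light red pixel

before = settings[select_channel(light_red_tags)]
swap_channels(settings, 1, 6)
after = settings[select_channel(light_red_tags)]
```

Before the swap the light red pixel receives the red-to-blue correction; after the swap channel 1 holds the light-to-dark correction, so the same pixel is darkened instead.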

D. Texture and Sharpness Detection

The DCP can also sense and respond to texture. Texture can be sensed on the basis of an analysis of luminance values according to standard methods as described, for example, in Pratt at 503-511, incorporated by reference. Texture is detected by analyzing the luminance data in a series of lines in the picture by known methods, according to criteria such as spatial frequency and randomness. A spatial correlation function is defined in equation 17.8-1 on page 506 in Pratt. No one pixel can define a texture. A group of pixels is needed to define a texture. Pratt recommends a window of about 6×6 pixels to define texture, at page 507.

Likewise, sharpness can be detected even more simply. Pages 319-325 of Pratt describe a method for detecting sharpness. Simply described, looking at a 3×3 window of pixels, if all of the pixels are similar to one another, the area is not very sharp, whereas if there is a large difference between one area of the window and another, then that area is considered sharp.
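The simplified description of sharpness detection can be illustrated as below. This sketch does not reproduce the method in Pratt; it uses the spread between the largest and smallest luminance in the 3×3 window as a stand-in measure of "difference", and both the function name `is_sharp` and the threshold value are assumptions chosen for illustration.

```python
def is_sharp(window, threshold):
    """Return True if the luminance spread in a 3x3 window exceeds threshold."""
    values = [value for row in window for value in row]
    return max(values) - min(values) > threshold


# A nearly uniform area and an area containing a vertical edge.
flat = [[100, 101, 99], [100, 100, 102], [98, 100, 101]]
edge = [[20, 20, 200], [20, 20, 200], [20, 20, 200]]
```

With a threshold of 32, for instance, the uniform window is classed as not sharp and the window containing the edge is classed as sharp.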

FIG. 8 shows an alternative cache tag RAM which can be set up for responding to texture. Address ranges in the RAM are arbitrarily assigned to correspond to different types of texture.

As seen in FIG. 1, luminance data are loaded into a multi-line store 11a and then the data in the store 11a are analyzed by a texture evaluator 11b. The time delay provided by the delay 5 is adjusted to accommodate the cycle time of the store 11a. Depending on what texture is detected, a predetermined 10-bit word can be outputted as a T signal to the pixel identification table 15. If the output of the texture evaluator 11b is, for example, binary 512, indicating a brick-like texture, then when that word is applied as an address to the texture RAM shown in FIG. 8, a 0 is found to be entered at address 512. Therefore, the particular group of pixels being analyzed for their texture will not be selected for any modifications. On the other hand, if the texture of paving stone is detected, then an output number, for example, binary 256 will be outputted to the pixel identification table 15. As seen in FIG. 8, address 256 has a "1". Therefore, the output from the texture RAM will be 1. This output is "ANDed" with the respective output of the H, S, L, X and Y RAMs, and if the ANDed result is "true" then it is determined that the pixels then being detected for texture have been tagged for modification.

E. User Interface

The user interface of the DCP is designed for "user-friendliness". When initially turned on, it emulates prior 6-vector secondary color correctors such as the RCA Chromacomp, which merely give the operator direct control over the relative proportions of six specific colors, the primaries (red, green and blue) and the secondaries (cyan, magenta and yellow). Operators have come to think in terms of those six colors. The Da Vinci gave greater control, dividing the color circle into 16 hues, but still, the Da Vinci controlled only hue. In contrast, the DCP also controls luminance, saturation and other picture attributes.

To make the DCP more user-friendly, its user interface initially displays six channels configured as a standard six-channel color corrector. In contrast to the standard corrector, however, the locations of the six vectors are not fixed, but rather can be "steered" so that each of the six channels can control any desired color. In the preferred embodiment, two additional channels are provided as well, giving a total of eight channels, although those last two channels are not initially displayed, but instead may be activated manually.

For example, the initially displayed red, magenta and yellow channels could all be "steered" to control three different shades of red. The operator might then wish to use the additional channels 7 and 8 to control magenta and yellow.

The control panel 300 of the DCP is shown in FIG. 9. As seen therein, there is an electroluminescent display panel 302, which may be a color or monochrome liquid crystal display. The EL panel 302 displays the current selected parameters. Preferably the EL panel 302 is also touch-sensitive. The control panel 300 can be used in a free-standing mode to manipulate color and the other parameters that the DCP operates on. However, as in most post-production devices, the usual mode of operation will be under the control of a programmer/controller such as the POGLE.

A group of six buttons 304 corresponds to the six channels that are initially available according to the preferred embodiment of the invention. A group of dials 306 (preferably rotary encoders) is provided for setting the upper boundaries of selected H, S, or L ranges, while a second group of dials 308 is provided for setting the corresponding lower bounds of the selected ranges. Extra dials are provided which can be set for detecting sharpness, location, texture, etc. Output H, S, L controls 310 are also provided to set, e.g., the amount of correction to be applied to H, S, L or another attribute.

A trackball 312 is a universal device which can point and click on any menu option. All of the above functions, including those that correspond to control buttons, are also accessible by means of the trackball, as well as via the touch screen 302 when the menu options are displayed on screen. Likewise, the trackball and/or touch screen are used to control the seventh and eighth channels for X and Y information. A reset button R is also seen in FIG. 9.

F. Relative Tag RAM

A modification of the disclosed architecture would have a relative or "grey" tag RAM, instead of "binary". Instead of the disclosed architecture (FIG. 5A), wherein the cache tag RAM provides a binary lookup for each channel, giving the limited capability of tagging colors to "modify" or "not modify," there would be a relative or "grey" value (FIG. 5B), for example in a range of binary 0-8, at each location in the H, S and L offset tables. Relative modifications would help to avoid the possibility of a discontinuity at a boundary between colors that are modified and not modified (in the absence of a convolver or some other facility for smoothing the boundary).

The grey level cache tag RAM would avoid such a discontinuity, by marking each specific shade with an indication of how much it is to be modified. For example, mid-reds could be tagged to be strongly modified, while light and dark reds would be tagged for a slight modification. This would improve the naturalness of the resulting picture.

FIG. 13 illustrates the operation of a relative or "grey-level" cache tag RAM of the type shown schematically in FIG. 5B.

By comparison, the preferred embodiment, as shown in FIGS. 1, 2, 5A, and 12, for example, employs a binary or "tag/no tag" RAM. Pixels are either tagged for alteration or they are not tagged. Thus the output of the AND gate in FIG. 12 is either a 1 or a 0.

FIG. 12 shows the respective single-bit H, S, L, X and Y RAMs (lookup tables) 150H, 150S, . . . , that are part of a single channel N of the DCP. For a given pixel, the respective H, S, L, X and Y data for that pixel are applied to the lookup tables of channel N, and the outputs thereof are ANDed by an AND gate 151. Assuming that channel N is given priority by the priority logic (FIGS. 2 and 7), then the respective offsets in the offset data registers N, corresponding to channel N, will be outputted to the combiners 12. Only the ΔH offset register N is shown in FIG. 12. The contents of the ΔH offset register N are not modified in any way.

According to the variation in FIG. 13, in contrast with FIG. 12, a spectrum of light, medium and heavy tagging and in-between levels is provided. The H, S, L, X and Y registers tag with a byte rather than a bit, for example a 3-bit byte as shown in FIG. 5B. The outputs of the respective registers in response to a given input pixel may vary from binary 0 to 7. These outputs are added by the adder 151' to obtain a smoothly variable modulation signal. The content of the offset register 16 for the corresponding channel is, for example, a constant and is multiplied at 152' by the modulation signal to obtain the output ΔH for the combiners 12.

A further, more complex variation is seen in FIG. 14, with cross-modulation of signal attributes. The H, S, L, X, Y registers 150' and the channel offset register 16 for channel N are the same as those in FIG. 13. However, the constant output of the offset register 16 is combined with the outputs of the registers 150' by a plurality of multipliers 152H, 152S, 152L, 152X, 152Y which are arranged in series.

The embodiments of FIGS. 13 and 14 enable the DCP to modulate the replacement hue, for example, as a function of saturation, luminance, etc. The embodiment of FIG. 14 can be expected to give finer control.

For example, in the binary pixel identification table of FIG. 12, a certain range of red hues may be selected for alteration, and other red hues will not be altered. If high ranges of saturation and luminance parameters are also selected, then since the respective hue, saturation and luminance RAM outputs are ANDed, that given range of red hues will be selected and altered only when they have, for example, high luminance and high saturation.

In contrast, in the relative or grey level cache tag RAMs in FIGS. 13 and 14, it is possible not merely to modify or not modify, but rather, to apply light, medium, or heavy modifications, or levels in between. The relative output values from the luminance and saturation RAMs in FIG. 13 will be added with the hue output value, and the resulting signal will be used to modify the contents of the offset register 16. The embodiment of FIG. 13 is somewhat less expensive, in that only one adder 151' and one multiplier 152' are required. On the other hand, the embodiment of FIG. 14 is more expensive, requiring at least five multipliers, but is mathematically appropriate and can be expected to give finer control.
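The difference between the additive arrangement of FIG. 13 and the series multipliers of FIG. 14 can be seen in the following sketch. It is an arithmetic illustration only: the function names are hypothetical, the tag levels are taken as 3-bit values from 0 to 7, and no normalisation of the result is attempted because none is specified here.

```python
def modulate_sum(tags, offset):
    """FIG. 13 style: add the tag levels, then multiply the offset once."""
    return offset * sum(tags.values())


def modulate_product(tags, offset):
    """FIG. 14 style: pass the offset through one multiplier per attribute."""
    result = offset
    for level in tags.values():
        result *= level
    return result


tags = {"H": 7, "S": 3, "L": 5, "X": 1, "Y": 1}
zero_hue = dict(tags, H=0)  # same pixel attributes, but hue tagged "not at all"
```

In the additive form a zero hue level still leaves a non-zero modulation contributed by the other attributes, whereas in the series form a zero level on any one attribute suppresses the modification entirely, which mirrors the behaviour of the AND gate in the binary case.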

FIG. 15 shows the hardware implementation of the embodiment of FIG. 14 in more detail.

G. Offset Table

Having identified the regions to modify and not to modify with the pixel identification table 15, the amount those regions are to be modified is then specified by the offset table 16, which in this embodiment of the invention provides respective H, S, and L offsets. See FIG. 12. The offset table 16 is a series of registers which are addressed by the output from the CTR 15, only one being shown in FIG. 12.

The offset RAMs hold an H offset, S offset, and L offset for each H, S, and L value to be modified in each channel. The starting value in each register for each channel is zero. The operator can increase or decrease these values by means of rotary controls.

As a simple example, assume that a video scene contains two differently colored objects, for example a red car and a yellow car. It may be desired to change the yellow car to red, to match the color of the red car. The operator specifies the channel in which the hue of the yellow car is to be stored, and identifies the yellow car by storing the exact hue value of the yellow car within the pixel identification table 15. In practice, the operator can position a cursor on the yellow car and the hue of the yellow car will be stored automatically. Then, the operator inputs an appropriate color offset in the channel of the offset table 16 corresponding to the yellow car's channel in the pixel identification table 15. To do this, the operator selects the channel of the yellow car and rotates an appropriate control for incrementing or decrementing the hue register for that channel, until the correct value is found, such that the color of the yellow car now matches the color of the red car. Then the operator manually stores that offset value in the offset table 16 with a control on the DCP. Having set up the pixel identification table 15 and the offset register 16 in this way, the circuitry will react to each pixel having the identified yellow hue that appears within the video signal, causing the stored offset for that channel to be added to the hue value of that pixel by the adder 12. Thus the output of this stage will cause all pixels having that yellow hue to be changed to red pixels.

X and Y offsets can also be supplied if it is desired to modify the X and Y coordinates of a pixel.

The offset table can also be employed to modify the sharpness of a region, for example in response to a particular texture or sharpness detected by the texture evaluator. For that purpose, the offset table 16 would be loaded with appropriate data for setting the convolver 7a to modify the sharpness of that region, according to a known process. Such a process is performed by known Rank Cintel telecines, which employ a single number as an input to a convolver to control sharpness or softness.

One advantageous use of sharpness modifications by means of the convolver 7a might be as follows. It might be necessary to remove the "beating" of a high-frequency, shiny car radiator grill. It would be possible to detect the region to be altered (the car radiator) by its high luminance. Then, having detected that region, the convolver 7a would be supplied with data from the offset table 16 causing it to modify the sharpness of that region, to blur it slightly and remove the beating effect from the final picture.

As a further improvement on the foregoing example, it would be possible to select the radiator but avoid inadvertently selecting the sky, which also has high luminance. By ANDing the sharpness parameter and the luminance parameter, the car radiator would be selected, because it has both high luminance and high sharpness; but high-luminance, low-sharpness regions such as the sky would not be selected.

In other words, the pixel identification table 15 is loaded with data for a given channel to identify regions of high luminance. The offset table 16 is loaded with a parameter to control the degree of smoothing applied to those regions for that given channel, and that parameter is supplied to the convolver 7a and employed for smoothing the output of the DCP. In this example, it is only desirable to smooth certain areas of the picture, namely those areas that have been selected for modification. It would be undesirable to smooth the entire picture, which would make it look soft and lose picture detail. Therefore, the convolver 7a is only activated for those regions that have been selected for modification. To summarize, the pixel identification table 15 selects where to convolve, while the offset table 16 controls how much to convolve.

H. Signal Modification

These offsets are then combined with the original H, S, and L values of the original (possibly downsampled) signal by means of combiners 12, by specific rules; namely, H offsets are added, while S and L offsets are multiplied. Although these rules are not absolutely necessary to carry out the invention, it has been found experimentally that following these rules gives the most natural appearance. H corresponds to a phase angle or vector, while S and L are magnitudes. So, for example, multiplying either S or L by a modification factor of +N % will give the same apparent degree of modification for both large and small starting values of S and L, which is desirable. On the other hand, since H is a phase angle, the H modification amount should be added, not multiplied, in order to obtain results independent of the starting value.
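The combining rules can be sketched as follows. The sketch makes several assumptions that are not part of the description above: hue is expressed in degrees and wrapped at 360, S and L are fractions, and the S and L modifications are given as percentages. The function name `apply_offsets` is hypothetical.

```python
def apply_offsets(h, s, l, dh, ds_percent, dl_percent):
    """Add the hue offset (wrapping as an angle); scale S and L by a percentage."""
    new_h = (h + dh) % 360.0                # H is a phase angle: offset is added
    new_s = s * (1.0 + ds_percent / 100.0)  # S is a magnitude: offset multiplies
    new_l = l * (1.0 + dl_percent / 100.0)  # L is a magnitude: offset multiplies
    return new_h, new_s, new_l


weak = apply_offsets(350.0, 0.10, 0.20, 20.0, 10.0, 0.0)
strong = apply_offsets(350.0, 0.80, 0.20, 20.0, 10.0, 0.0)
```

Both pixels receive the same 20 degree hue rotation regardless of their starting hue, while the 10% saturation change scales with the starting saturation, so a weakly saturated and a strongly saturated pixel are altered by the same apparent degree.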

The resultant modified H, S and L signals are then converted to modified red, green, and blue signals R', G' and B' by a digital matrix and coordinate translator 13.

At this point, this modified signal could be sent direct to the output. However, that would be undesirable for at least two reasons. First, the entire video signal has been processed twice, to convert from RGB to YUV to HSL to YUV and back to RGB. Even with the use of digital circuitry, there are cumulative mathematical errors within this process that would cause distortion to the entire video signal, whether or not color-corrected. Second, as the color correction has been performed in HSL color space, it is possible that illegal combinations of color may have been introduced into the RGB signal after passing through the output matrix. In order to overcome these problems a further processing stage is used.

The output of the unit 13 is provided to a combiner 14. The combiner 14 compares the newly modified RGB signal to the original RGB signal that has not passed through the processing loop. The combiner looks for differences between the two signals and performs two separate functions simultaneously: (a) the combiner has knowledge of which pixels should have been modified, by checking the output of the pixel identification table 15. It therefore assumes that any differences, if no channel was selected, are due to mathematical errors and these can therefore be removed; and (b) the modified RGB signal (with mathematical errors removed) is subtracted from the original RGB signal to produce an error signal.

The combiner 14 takes the values R', G', B' and subtracts from them the original R, G and B from point B (or vice versa), to obtain modification signals ΔR, ΔG and ΔB. The modification signals are then compared with the original ΔH, ΔS and ΔL by the combiner 14 so as to avoid unintended modifications. For example, if the offset signals ΔH, ΔS and ΔL are zero, then no modification was intended and the ΔR, ΔG and ΔB outputs from the combiner 14 are forced to zero. It is assumed that if they are non-zero, that is merely the result of small limited-precision mathematical errors that arose in the units 11-13 or elsewhere. This feature contributes substantially to keeping input signals free from corruption if they are not intended to be modified. Known ESCC's would propagate such small errors, resulting in slight modifications of the picture where none was intended.
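The behaviour of the combiner 14 can be modelled as in the sketch below. The function names and the representation of signals as integer triples are hypothetical, and the sign convention (modified minus original) is one of the two alternatives allowed by the description.

```python
def modification_signal(original_rgb, modified_rgb, offsets):
    """Return (dR, dG, dB), forced to zero when no offset was requested."""
    if all(value == 0 for value in offsets):
        return (0, 0, 0)  # any difference can only be a rounding error
    return tuple(m - o for m, o in zip(modified_rgb, original_rgb))


def output_pixel(original_rgb, delta):
    """Apply the modification signal to the clean, delayed original."""
    return tuple(o + d for o, d in zip(original_rgb, delta))


original = (100, 150, 200)
# Round trip through the processing loop with no correction requested:
untouched = modification_signal(original, (101, 150, 199), (0, 0, 0))
# The same pixel with a hue offset requested:
corrected = modification_signal(original, (120, 140, 200), (5, 0, 0))
```

An uncorrected pixel therefore leaves the unit bit-for-bit identical to its input, whatever small errors arose in the conversion chain.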

It is these error signals ΔR, ΔG, ΔB which are used to modify the original clean RGB signal which is timed to then be passing through the delay stage 5.

Then these modification signals are applied to a convolver 17. A well-known convolution technique that may be employed is disclosed in William K. Pratt's book Digital Image Processing (John Wiley & Sons 1978), ISBN 0-471-01888-0, at 319 and 322-25, incorporated by reference. In the disclosed technique, which is only one of many available for removing noise or artifacts, each pixel is sequentially examined, and placed at the center of an imaginary 3×3, 5×5, or similar pixel array. If the L, for example, of the pixel is mathematically greater than the average of its immediate neighbors by some threshold level, it is replaced by the average value.
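The thresholded averaging just described can be sketched for a 3×3 array as follows. This is an illustration of the stated rule only, not of the convolver hardware: the function name `suppress_outliers` is hypothetical, the image is a plain list of rows of luminance values, and border pixels are simply left unchanged, which is an assumption.

```python
def suppress_outliers(image, threshold):
    """Replace each interior pixel that exceeds the mean of its 8 neighbours
    by more than `threshold` with that mean."""
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            neighbours = [image[y + dy][x + dx]
                          for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                          if (dx, dy) != (0, 0)]
            mean = sum(neighbours) / len(neighbours)
            if image[y][x] - mean > threshold:
                result[y][x] = mean
    return result


noisy = [[10, 10, 10], [10, 90, 10], [10, 10, 10]]
cleaned = suppress_outliers(noisy, 20)
```

The isolated bright pixel at the centre is replaced by the average of its neighbours, while pixels that do not exceed the threshold pass through unaltered.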

The present system is particularly concerned with identifying complex areas within which pixels may be modified. As a first step it is necessary to define the shape and identify the pixels within it. A simple way of doing this is to mark critical points on the shape, which can then be joined up to define the outline of the shape.

There thus exists a need to store the co-ordinates of a number of points in two-dimensional space which will be used in defining a complex two-dimensional shape. One possibility would be to set up an area of memory which has a location for every possible point. Thus, if the area in which the shape may appear occupies 256×256 pixels there will be 65536 locations, one for each pixel. Each location will contain a 0/1 value indicating whether or not the pixel is relevant. This may be a reasonably powerful technique but it has high memory requirements.

There is thus proposed an arrangement in which two tables are set up, one having a location for each of the possible x co-ordinates and one having a location for each of the possible y co-ordinates. Thus, in the example above there would be an x co-ordinate table with 256 locations and a y co-ordinate table also having 256 locations. This is the cache tag RAM system described earlier.

In a simple arrangement described earlier where it is desired to define only a rectangle, there will be 1's in each of the x co-ordinate and y co-ordinate locations within the rectangle, and 0's in each of the other locations. The following table shows an example.

TABLE 1
______________________________________
x location   value   y location   value
______________________________________
0            0       0            0
1            0       1            0
2            1       2            0
3            1       3            1
4            1       4            1
5            1       5            1
6            1       6            0
7            0       7            0
8            0       8            0
______________________________________

By carrying out logical AND operations it will be established that the region selected has x co-ordinates between 2 and 6, and y co-ordinates between 3 and 5. This defines a rectangle which is 4×2 units and positioned centrally of the overall space being considered.
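By way of illustration only, the logical AND of the two one-bit tables can be sketched as follows. The names `X_TABLE`, `Y_TABLE`, `in_rectangle` and `tagged_span` are hypothetical and do not appear in the specification; the table contents are those of Table 1.

```python
# Illustrative sketch only: one-bit x and y tables copied from Table 1.
X_TABLE = [0, 0, 1, 1, 1, 1, 1, 0, 0]
Y_TABLE = [0, 0, 0, 1, 1, 1, 0, 0, 0]


def in_rectangle(x, y, x_table=X_TABLE, y_table=Y_TABLE):
    """Return 1 if pixel (x, y) is tagged in both tables, else 0."""
    return x_table[x] & y_table[y]


def tagged_span(table):
    """Return (first, last) tagged location of a one-bit table."""
    tagged = [i for i, bit in enumerate(table) if bit]
    return tagged[0], tagged[-1]
```

For a 256×256 region this arrangement needs 256 + 256 = 512 locations rather than 65536, at the cost of being able to describe only a rectangle.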

Such a system is however limited to rectangles. Whilst that will be adequate for many situations, it is often desired to define more complex shapes.

To define more complex shapes it is necessary to store the locations of discrete points. It is possible to store the co-ordinates of a single point in the tables referred to above but if three points--the minimum number to define an enclosed area of space--are to be stored then problems arise. Each table would have three x locations tagged and three y locations tagged but there would be no way of determining which x location is associated with which y location.

The tables mentioned above are purely binary, storing the values 0 or 1. However, it is possible to establish tables which can store a number of bits in each location, thus enabling a larger number of values to be stored. Thus, for example, an 8 bit capability would enable the storage of 256 values, i.e. 0 to 255. By storing a specified value in an x co-ordinate location and the same value in the associated y co-ordinate location, the co-ordinates can be linked.

By way of explanation, the table below shows how it would be possible to define a triangle:

TABLE 2
______________________________________
x location  value   y location  value
______________________________________
0           0       0           0
1           1       1           0
2           0       2           1
3           0       3           0
4           2       4           3
5           0       5           0
6           3       6           0
7           0       7           2
8           0       8           0
______________________________________

The triangle has its corners at 1,2; 4,7; and 6,4.
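The linking of x and y locations by a shared stored value can be illustrated with a short sketch. The function name `linked_points` is hypothetical; the table contents are those of Table 2, and the value 0 is assumed to mean "no entry".

```python
# Illustrative sketch only: multi-bit tables copied from Table 2.
X_TABLE = [0, 1, 0, 0, 2, 0, 3, 0, 0]
Y_TABLE = [0, 0, 1, 0, 3, 0, 0, 2, 0]


def linked_points(x_table, y_table):
    """Pair each x location with the y location holding the same non-zero value."""
    x_of = {value: x for x, value in enumerate(x_table) if value}
    y_of = {value: y for y, value in enumerate(y_table) if value}
    return {value: (x_of[value], y_of[value])
            for value in sorted(x_of) if value in y_of}
```

Applied to Table 2 this recovers the three corners given in the text, keyed by their linking values.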

There is however a further problem. With larger numbers of co-ordinates it is not possible to determine the order in which they should be joined up to define the desired shape. Quite different shapes can be defined by passing between co-ordinates in different orders. This problem is overcome by assigning to each chosen co-ordinate a number which not only links it to its partner but also defines the order in which lines would be drawn between the co-ordinates. The following table shows how four sets of co-ordinates can define a quadrilateral.

TABLE 3
______________________________________
x location  value   y location  value
______________________________________
0           0       0           0
1           0       1           0
2           1       2           1
3           4       3           2
4           0       4           0
5           0       5           0
6           0       6           0
7           2       7           4
8           3       8           3
______________________________________

The quadrilateral has its corners at 2,2; 7,3; 8,8; and 3,7.

However, by maintaining the same points but altering the order a different shape can be obtained. Thus by altering the order of the 8,8 and 3,7 points, two triangles will be defined by means of intersecting diagonals. This is shown in the following table, where the order of these two points has been swapped.

TABLE 4
______________________________________
x location  value   y location  value
______________________________________
0           0       0           0
1           0       1           0
2           1       2           1
3           3       3           2
4           0       4           0
5           0       5           0
6           0       6           0
7           2       7           3
8           4       8           4
______________________________________
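The effect of the order numbers can be checked with a small sketch, given here as an illustration under stated assumptions rather than as the method of the specification. The helper names are hypothetical, and the segment-crossing test is a standard orientation test chosen for the example. The first vertex list is the quadrilateral order given in the text; the second is read back from Table 4.

```python
# Illustrative sketch only: Table 4 contents, with order numbers as values.
X_TABLE_4 = [0, 0, 1, 3, 0, 0, 0, 2, 4]
Y_TABLE_4 = [0, 0, 1, 2, 0, 0, 0, 3, 4]


def vertices_in_order(x_table, y_table):
    """Return the (x, y) points sorted by their shared order number."""
    x_of = {v: x for x, v in enumerate(x_table) if v}
    y_of = {v: y for y, v in enumerate(y_table) if v}
    return [(x_of[v], y_of[v]) for v in sorted(x_of)]


def _orient(a, b, c):
    """Signed area test: >0, <0 or 0 according to the turn a -> b -> c."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _cross(p, q, r, s):
    """True if segments pq and rs properly intersect."""
    return (_orient(p, q, r) * _orient(p, q, s) < 0
            and _orient(r, s, p) * _orient(r, s, q) < 0)


def self_intersects(vertices):
    """True if any two non-adjacent edges of the closed outline cross."""
    n = len(vertices)
    edges = [(vertices[i], vertices[(i + 1) % n]) for i in range(n)]
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last edges share a vertex
            if _cross(*edges[i], *edges[j]):
                return True
    return False


QUAD = [(2, 2), (7, 3), (8, 8), (3, 7)]          # order given for the quadrilateral
SWAPPED = vertices_in_order(X_TABLE_4, Y_TABLE_4)  # order stored in Table 4
```

With the swapped order the edges from 7,3 to 3,7 and from 8,8 to 2,2 cross, which gives the two triangles mentioned above.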

It will be appreciated that this particular way of using the tables only permits any co-ordinate to be used once. Thus, a square defined by the co-ordinates 2,2; 7,2; 7,7; 2,7 cannot be set out as each x and y co-ordinate is used twice. For example, x location 7 cannot be defined in the table as having order number 2 with y co-ordinate 2, and order number 3 with y co-ordinate 7. There are various ways of overcoming this problem.

It must be borne in mind that the principal purpose of this system is to permit shapes to be defined for the purposes of video editing, colour correction and so forth. Typically an operator using a light pen, tablet, mouse or the like will select a few key points. The accuracy of these may not be pixel perfect and indeed there may be no absolutely clear boundaries. It is therefore of little practical importance if, in a 256×256 pixel region, a point is one pixel or even more away from its "ideal" position. Accordingly, if a particular x or y location already has an entry the next location can be used. In the above example of the square, the result could be the quadrilateral--a parallelogram--of Table 3. Obviously, at this scale the difference between the two shapes is marked, but this is in a very small region for explanation purposes only. Over a larger region and with shapes of more realistic sizes, the differences will be acceptable.
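A minimal sketch of the "use the next location" rule follows. The function name `store_points` and the choice to step upwards independently in each table are assumptions made for the example; the text does not fix the direction or policy, so the points actually stored depend on the rule adopted.

```python
def store_points(points, size):
    """Store ordered points in x and y tables, stepping to the next free
    location whenever the wanted location already holds an entry.

    Returns (x_table, y_table, stored), where stored lists the locations
    actually used for each point.
    """
    x_table = [0] * size
    y_table = [0] * size

    def place(table, coord, order):
        loc = coord
        while loc < size and table[loc]:
            loc += 1  # assumed policy: try the next higher location
        if loc >= size:
            raise ValueError("no free location at or after %d" % coord)
        table[loc] = order
        return loc

    stored = []
    for order, (x, y) in enumerate(points, start=1):
        stored.append((place(x_table, x, order), place(y_table, y, order)))
    return x_table, y_table, stored
```

For the square 2,2; 7,2; 7,7; 2,7 in a nine-location table, this particular rule stores 2,2; 7,3; 8,7; 3,8, a slightly skewed quadrilateral close to, but not identical with, the one described in the text.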

Another approach would be to repeat the locations in the tables. Thus, a 256 location table could be split into two 128 location tables. Thus, x co-ordinate 2 would be at location 2 and at location 130. This could be done as many times as desired, depending upon the maximum number of locations available, the likely number of points that would be defined and the likely number of occurrences of a repeated co-ordinate.
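The repeated-location approach might be sketched as below. The constants and function names are hypothetical, and the sketch assumes that a co-ordinate is written to the first half in which its location is still free.

```python
BANK_SIZE = 128  # assumed: a 256 location table split into two halves
BANKS = 2


def store_banked(table, coord, order):
    """Write an order number at coord in the first half that is still free."""
    for bank in range(BANKS):
        loc = coord + bank * BANK_SIZE
        if table[loc] == 0:
            table[loc] = order
            return loc
    raise ValueError("co-ordinate %d already used %d times" % (coord, BANKS))


def lookup_banked(table, coord):
    """Return every order number stored for coord, across all halves."""
    return [table[coord + bank * BANK_SIZE]
            for bank in range(BANKS)
            if table[coord + bank * BANK_SIZE]]
```

The trade-off is that each split halves the range of co-ordinates the table can address.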

Thus, there is provided a way of defining critical boundary points of a complex shape using x and y co-ordinate tables. The system uses the cache tag RAM technique referred to earlier, and in the preferred form uses the relative RAM described earlier. There are, of course, other ways of storing such co-ordinates and these could be used even though they are not the preferred methods. It would also be possible to use another co-ordinate system, e.g. polar co-ordinates with r,θ values.

Once the critical points have been identified, the lines between the points, going in the correct order, are defined. These can be straight lines, calculated curves, parabolic fits or complex bicubic spline fits. There are many known ways of achieving this. The subsequent data structure is then converted into a raster line structure, using known techniques and hardware, i.e. a raster image processor (RIP). The edges will now appear in sequential scanning lines. Thus, for each line there will be a series of zero values but with a 1 at the appropriate pixel location(s) where the line intersects the outline of the shape. Of course some lines will not intersect the outline at all. It is then necessary to "fill" the shape. There are known processes for doing this.

For example, starting with an assumed zero at pixel 1, line 1 one can increment the pixel count until a 1 is encountered. Then fill with 1 values until another original 1 is encountered, following which zero values will be entered. Of course, with complex shapes there may be further intersections on the same line. This is a simple example of a fill algorithm and more complex algorithms are known to one skilled in the art.
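The simple fill just described can be sketched for a single scanning line as follows. The function name `fill_line` is hypothetical. The sketch toggles an inside/outside state at every original 1, so it assumes each crossing of the outline occupies exactly one pixel on the line; vertices, tangents and horizontal edges would need the more complex algorithms referred to above.

```python
def fill_line(edges):
    """Fill one raster line.

    edges: list of 0/1 values with a 1 wherever the line meets the outline.
    Returns a list of the same length with 1s between alternate crossings.
    """
    filled = []
    inside = False
    for bit in edges:
        if bit:
            inside = not inside   # each original 1 toggles the state
            filled.append(1)      # the outline pixel itself is kept
        else:
            filled.append(1 if inside else 0)
    return filled
```

For example, a line with crossings at pixels 1 and 4 is filled between them, and a line with four crossings yields two separate filled runs.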

The data defining the pixels which are within the shape is then stored in a suitable location such as a video frame buffer.

It may be desirable to blur the edges of the object. Thus a convolving technique could be carried out using a filter mask such as an FIR (finite impulse response) filter, an IIR (infinite impulse response) filter or a laplacian filter. These are described in Platt, pages 322-327. There will be produced "grey" values ranging from zero outside the object, through grey or intermediate values at the edges, to all bits set at a point well inside the object. It is important to note that the convolving takes place on the data identifying the relevant pixels, and not on the actual image data.
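As an illustration, the edge softening can be sketched with a 3×3 box filter, a very simple FIR mask chosen here only for the example; the specification does not prescribe particular coefficients. The function name `blur_mask` and the 8 bit full-scale value are assumptions. Note that the filter is applied to the 0/1 data identifying the pixels, not to the picture itself.

```python
def blur_mask(mask, full=255):
    """Convolve a 0/1 shape mask with a 3x3 box filter.

    mask: list of rows of 0/1 values.
    Returns rows of integers from 0 (outside) to `full` (well inside),
    with intermediate "grey" values along the edges.
    """
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:
                        total += mask[yy][xx]
            out[y][x] = total * full // 9
    return out
```

A larger mask, or different coefficients loaded per frame, would give a wider or differently shaped transition.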

Ideally the hardware design allows for the reloading of the coefficients of the laplacian or other filter on a frame by frame basis. This is because, depending upon the content of a given picture, the coefficients may change. For example, an object such as an automobile may move across a screen. In the centre of the screen the automobile may pass under the shadow of a tree. In this lighting condition the coefficients of the given mask may have to be radically different to produce a useable edge, compared with the coefficients that produce a good edge when the automobile is well lit in strong sunshine. Another situation is where objects overlap. This may happen where there are two or more objects in the picture that are to be tracked. This presents no problems in general, as the DCP architecture is multi-channelled. However, the tracking of multiple objects is essentially a serial operation. By utilising coefficients of different edge strength for each object, it is possible to resolve priorities of objects which overlap. The architecture allows the reloading of the mask coefficients at a rate at least as fast as once per frame.

A further enhancement is the addition of a further processing module which can analyse the image content of a given frame and find the edges automatically. Hardware devices are known which can extract images or produce edge maps at reasonable data rates. This could be a background process, and the taking of, say, half a second per frame is not a serious impediment.

One way of carrying out such a process is to perform the edge extraction process only upon key frames. Many of these will be adjacent scene changes, so that they delineate scenes. Several frames will normally also be specified as reference frames which are studied in detail for optimum transfer. Carrying out the edge extraction process on key frames reduces the computational load, as compared to carrying it out on every frame. Intermediate frames between key frames can then be tracked and the vertices followed from frame to frame. There are a number of block matching techniques, as discussed, for example, in Platt. The process leads to shapes being detected in the frames between the key frames.
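One common block matching technique, an exhaustive search minimising the sum of absolute differences (SAD), is sketched below to show how a vertex might be followed from one frame to the next. This is an assumption made for illustration; the specification does not say which block matching technique is used, and the function names, block size and search radius are hypothetical.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(frame_a[ay + j][ax + i] - frame_b[by + j][bx + i])
    return total


def match_block(prev, curr, x, y, size=4, radius=3):
    """Find the offset (dx, dy) within +/- radius at which the block at
    (x, y) in prev best matches curr. Frames are lists of rows of
    luminance values."""
    height, width = len(curr), len(curr[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            nx, ny = x + dx, y + dy
            if nx < 0 or ny < 0 or nx + size > width or ny + size > height:
                continue  # candidate block would fall outside the frame
            cost = sad(prev, curr, x, y, nx, ny, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]
```

The offset returned for a block around each vertex gives an estimate of where that vertex has moved in the following frame.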

Of course, it is possible to track objects using the simpler rectangular regions discussed earlier. In general the technique is to identify the shape in a start frame, identify it in the finish frame, and then allow the system to track the shape across the screen. It may be necessary to identify the object in one or more intermediate frames. Where the appearance has changed, then this will be taken into account. For example a car may turn a corner, or may move rapidly into the distance, getting smaller. The system preferably generates automatically the approximate shape and position for each of the frames between those on which the object has been marked. There are techniques for doing this.
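One simple technique for generating the approximate shape and position in the intervening frames is linear interpolation of corresponding vertices, sketched below. This is an illustrative assumption: the function name `interpolate_outline` is hypothetical, and the sketch requires the outline to be marked with the same number of vertices, in the same order, in both frames.

```python
def interpolate_outline(start, end, n_between):
    """Linearly interpolate an outline between two marked frames.

    start, end: lists of (x, y) vertices in corresponding order.
    n_between: number of frames lying between the two marked frames.
    Returns one vertex list per intermediate frame.
    """
    if len(start) != len(end):
        raise ValueError("outlines must have the same number of vertices")
    frames = []
    for k in range(1, n_between + 1):
        t = k / (n_between + 1)  # fraction of the way from start to end
        frames.append([
            (round(x0 + t * (x1 - x0)), round(y0 + t * (y1 - y0)))
            for (x0, y0), (x1, y1) in zip(start, end)
        ])
    return frames
```

Because each vertex is interpolated separately, a change of size (such as a car receding into the distance) is handled together with the change of position.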

In a practical system an operator may simply mark the object in the beginning and end frames, and see how the results turn out. If they are unacceptable, then intermediate frames can be identified and the object marked out on these. In the very worst case, of course, every frame will have to be analysed.

In FIG. 16 there are shown changes to the DCP system described above. As will be seen the synch signal is fed into a frame store. Into this may come input from an operator, e.g. via a pen/tablet or a tracker ball or other device, and/or a signal from a motion vector analyser giving the x-offset and y-offset. From here the x-pixel address and y-pixel address are fed to the table 15. The output from the system, for use in manipulating objects, will generally be in the form of a key signal.

By using the techniques above, it is possible to identify a car which is of complex shape, to track it across the screen, and to alter the colour of a particular component. For example, red rear lights could be changed to orange or green.

In some cases it will be possible to identify an object exclusively by colour alone. In other cases, it can be identified by position alone. However, by combining the two it is possible to track objects in complex situations.
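The combination of colour and position can be illustrated by a sketch in which a pixel is keyed only if it lies inside the expected region and its colour is close to a target colour. The function name `key_signal`, the RGB representation and the per-channel tolerance test are assumptions made for the example.

```python
def key_signal(frame, region, target, tolerance):
    """Build a 0/1 key for pixels that are both inside the region and
    within `tolerance` of the target colour on every channel.

    frame: rows of (r, g, b) tuples.
    region: rows of 0/1 values marking the expected position of the object.
    """
    key = []
    for pixel_row, region_row in zip(frame, region):
        key.append([
            1 if inside and all(abs(c - t) <= tolerance
                                for c, t in zip(pixel, target)) else 0
            for pixel, inside in zip(pixel_row, region_row)
        ])
    return key
```

In this sketch a red pixel outside the region and a green pixel inside it are both rejected, which is how a red car can be separated from a green one crossing the same picture.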

It will be appreciated that the systems described are not confined to simply correcting colours. By identifying an object by means of whatever information is available and required to identify it exclusively, it is possible to carry out many other manipulations. Thus the modification of selected pixels may go beyond what would normally be classified as colour correction. The changes made may amount to complete eradication of an object and its replacement by another object or even just a background.

Thus consider the situation in which a camera is fixed, pointing at a static background. A car then drives steadily across the picture. By picking out the outline of the car on the first frame containing the car, and on the last frame, it can be tracked. If there are two cars, one red and one green, then it is possible to ensure that the correct car is tracked by specifying that pixels within the outline must be of the appropriate colour. Consider the red car only. It is possible to build up difference tags to find out which areas were red at one time and not at a later or earlier time. It is therefore possible to replace all of the red car pixels by background pixels as they were when the car was not there. The information regarding the red car can be stored separately. Thus, the red car has disappeared from one scene but could be pasted into another scene.
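For the fixed camera case, the replacement of object pixels by background "as it was when the car was not there" can be sketched as follows. The function names and the use of per-frame 0/1 object masks in place of the difference tags are assumptions made for the example.

```python
def background_plate(frames, masks):
    """Collect, for every pixel, the first value seen while the object
    was not covering it. Pixels never uncovered remain None."""
    height, width = len(frames[0]), len(frames[0][0])
    plate = [[None] * width for _ in range(height)]
    for frame, mask in zip(frames, masks):
        for y in range(height):
            for x in range(width):
                if plate[y][x] is None and not mask[y][x]:
                    plate[y][x] = frame[y][x]
    return plate


def remove_object(frame, mask, plate):
    """Replace tagged object pixels by the recorded background, where known."""
    return [
        [plate[y][x] if mask[y][x] and plate[y][x] is not None else frame[y][x]
         for x in range(len(frame[0]))]
        for y in range(len(frame))
    ]
```

Pixels left as `None` in the plate are those for which there is no record of the background.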

Where there is no record of part of the background, perhaps because it has been obscured throughout the scene, then that part could be filled in with copied or interpolated background. One method is to interpolate spatially across the missing area, using systems varying from bilinear interpolation to cubic B spline interpolation. A second method is to fill in the area with a low frequency area copy. This would involve selecting a neighbouring area and copying it to the missing region. A low pass filtering function would then be performed on the adjacent data to remove disturbing discontinuities.
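The first method can be illustrated in one dimension by linear interpolation across a gap in a single line of pixels. This is a deliberately reduced sketch: the function name `fill_gap` is hypothetical, and a practical system would interpolate in two dimensions (bilinear or spline) as stated above.

```python
def fill_gap(row, start, end):
    """Linearly interpolate the missing pixels row[start:end] from the
    known neighbours row[start - 1] and row[end].

    Requires 1 <= start < end < len(row).
    """
    left, right = row[start - 1], row[end]
    span = end - start + 1
    out = list(row)
    for i in range(start, end):
        t = (i - start + 1) / span  # fraction of the way across the gap
        out[i] = left + t * (right - left)
    return out
```

For a gap of three pixels between values 10 and 50 the sketch produces 20, 30 and 40.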

Using a digital video effects (DVE) system embedded within the system, it is possible to carry out other effects. For example a removed object could be enlarged and placed back in the scene, or reduced and put back in the scene with background being added as necessary.

It would also be possible to change the time at which objects enter or leave a scene. Thus a car which enters a scene slightly too late can be made to enter earlier. Normally this would involve bringing forward the departure time. However, by interpolation techniques the car could be present until the original departure point, i.e. it could be made present for a longer time sequence than previously. This might result in unrealistic motion for major changes, but for the likely small adjustments to be made there should be no problems. Adjustments to the soundtrack would be made using existing audio editing technology such as the LEXICON system.

Where the camera is not fixed but is for example carrying out a slow pan across the picture, then motion vector techniques could be applied to isolate the motion of the camera. This can be done before attempting to deal with an object such as a car or performer.

The system of removing an object or layer can be used repeatedly. Thus if a performer is behind a lamppost and should be in front of it, the lamppost can be cut out, the performer cut out, the lamppost put back and then the performer put back. There will be some missing pixels for the performer, where the lamppost was originally, but these can be filled in using interpolation techniques.

It is important to appreciate that having identified the object to be modified, in a key frame, the operator only has to specify what is to be done in that frame and it will be put into effect automatically for a number of appropriate subsequent frames. It is not necessary to edit each frame individually.

A further use of the system is to take an object which has been shot against one background, and then superimpose it on another background or in another environment entirely, if desired in combination with other effects such as zooming, rotating and so forth. Currently, this is frequently done by shooting an object such as a performer against a chosen special coloured background which is generally dark blue but could be orange or green. Such systems have been developed by Petro Vlahos of the Ultimatte Corporation. They have a number of drawbacks, including lack of realism and coloured shadows around the images. It is necessary to know in advance that a background change will be made, and to ensure that effects such as wind and lighting are used which will match the eventual background.

In another video processing system for use in the system there is provided a method of processing a video image obtained by scanning photographic film frames, in which each frame on the film is scanned a plurality of times to produce a plurality of constituent frames each containing only part of the data required to represent the image at a relatively high resolution, the constituent frames associated with each film frame are assembled to provide a relatively high resolution video image, and the high resolution images are stored and/or displayed; wherein information is stored indicating the relationship between each constituent frame and its associated high resolution image, a processing decision is made, and processing is carried out on one or more of the constituent frames associated with that high resolution image.

The processing decision could be made on a high definition monitor, on the basis of the high resolution image. However, it would be possible to scan the film an additional time--normally before the multiple scans referred to above--to produce a standard resolution frame which is displayed on a lower definition monitor. The decisions would then be made, following which the above procedure would be carried out to produce the high resolution image. During this period an operator need not be in attendance as the decisions have already been made in real time using the standard definition scan and monitor. An advantage of this route is that it is not necessary to have a high definition monitor, thus reducing costs.

A significant advantage of such a system is that it is not necessary to use an expensive high resolution colour corrector or other processor.

The constituent frames may be obtained by dividing the film frame image into a plurality of sections and scanning each section. Each frame will contain the full number of pixels required to produce a high resolution image of its section, but that number will of course be a fraction of the total number of pixels required to form the complete high resolution image, depending upon the number of sections which typically may be four. A typical arrangement would be to divide the picture into top left, top right, bottom left and bottom right.

Alternatively, the constituent frames may be obtained by scanning across the entire image but in an interleaved fashion. Thus one scan may take in odd lines and odd pixels; the next odd lines and even pixels; the next even lines and odd pixels; and the final scan even lines and even pixels.
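The interleaved arrangement, and the stored relationship between each constituent frame and the high resolution image, can be sketched as follows. The function names are hypothetical. The sketch indexes from zero, so the key (0, 0) corresponds to the scan of "odd lines and odd pixels" when lines and pixels are counted from one.

```python
def split_interleaved(image):
    """Split an image (list of rows) into four constituent frames keyed by
    (line phase, pixel phase), each phase being 0 or 1."""
    return {
        (line, pixel): [row[pixel::2] for row in image[line::2]]
        for line in (0, 1)
        for pixel in (0, 1)
    }


def assemble_interleaved(parts, height, width):
    """Rebuild the high resolution image from its constituent frames."""
    image = [[None] * width for _ in range(height)]
    for (line, pixel), part in parts.items():
        for j, row in enumerate(part):
            for i, value in enumerate(row):
                image[line + 2 * j][pixel + 2 * i] = value
    return image
```

The phase key plays the part of the stored information relating a constituent frame to its place in the assembled image.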

The principal use for the system described above is in the field of high definition broadcast pictures having e.g. 1250, 1125 or 1050 lines as mentioned above. Thus a typical high resolution image will be 1800×1125 pixels as opposed to a "standard" resolution of 700×500 pixels. However, another use is in the field of processing "film resolution" images. These may have 4000×3000 pixels, and instead of the relationship of say 4 or 5 constituent frames there may be about 36 (6×6) constituent frames to one film quality frame.

A system as above described is disclosed in UK patent application 9320412.1 and a US application claiming priority therefrom. An outline of the system is shown in FIG. 17. The underlying processor will be as described with reference to FIG. 1 but the subsampler 3a and interpolator 5a would be omitted.

It is often desired to track objects as they move between frames. To deal with this in combination with the above described arrangement it will be necessary to have an extension to the addressing logic in the frame correlator. In real terms, the car will move from the top left of one high definition picture to the top right of the next. However, if the image is divided into four quarters, for example, the correlator will have to find the constituent frame for the top left quarter in one frame and the top right in the next.
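The additional addressing step can be illustrated by a sketch that maps a position in the high definition picture to the quarter containing it and to the position within that quarter. The function name `locate` and the equal division at the half-way points are assumptions; the actual addressing logic of the frame correlator is not detailed here.

```python
def locate(x, y, width, height):
    """Map a pixel of the full picture to (quarter name, local x, local y),
    assuming the picture is divided into four quarters at its mid-points."""
    half_w, half_h = width // 2, height // 2
    col = 1 if x >= half_w else 0
    row = 1 if y >= half_h else 0
    name = ("top", "bottom")[row] + " " + ("left", "right")[col]
    return name, x - col * half_w, y - row * half_h
```

An object tracked from x = 100 in one picture to x = 1700 in the next, in an 1800×1125 image, is thus looked up in the top left constituent frame and then in the top right one.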

Although the present inventions have been described in relation to particular embodiments thereof, many other variations and modifications and other uses will become apparent to those skilled in the art.

I claim:
 1. A method of digitally processing a sequence of video frames, the sequence of video frames including a representation of an object which appears in said sequence of frames and undergoes at least one of relative motion and transformation, the object being represented in each of said frames by a plurality of pixels, said method comprising the steps of: selecting said object in a first frame under the control of an operator; locating and tagging the pixels defining said object in said first frame by means of information including at least one of a colour attribute and an appearance attribute; automatically locating corresponding pixels defining said object in subsequent frames of said sequence by means of said information including at least one of a colour attribute and an appearance attribute and by means of information indicating at least one of an expected position and an expected shape of said object in said subsequent frames; and processing only the pixels defining said object in each of said frames and passing the unselected pixels outside the object without any processing.
 2. The method of claim 1, wherein the step of selecting said object in said first frame comprises the steps of: marking a plurality of points around said object; defining vectors joining said points so as to define a boundary around said object; carrying out a scanning operation so as to identify pixels within said boundary; and storing the locations of pixels identified by said scanning operation.
 3. The method of claim 1 wherein said information indicating the expected position of said object is obtained by marking said object in two spaced apart frames of said sequence and interpolating to establish the expected position of said object in frames intermediate said two spaced apart frames.
 4. The method of claim 1, 2 or 3 wherein the processing step comprises colour correction.
 5. The method of claim 1, 2 or 3 wherein the processing step comprises spatial manipulation of said object.
 6. A method of digitally processing a sequence of video frames, each of said frames comprising a plurality of digital pixels, a number of which define an object that undergoes at least one of relative motion and transformation in the course of said sequence, the method comprising the steps of: receiving an input from an operator, said input indicating a region of a first frame containing said object; locating and tagging, by means of information including at least one of a colour attribute and an appearance attribute of said pixels, those pixels within said region that define said object; automatically locating corresponding pixels defining said object in subsequent frames of said sequence by means of said information including at least one of a colour attribute and an appearance attribute and by means of information indicating at least one of an expected position and an expected shape of said object in said subsequent frames; and processing only said pixels defining said object in each of said frames and passing the unselected pixels outside the object without any processing.